Town trip forecasting based on data mining techniques
Abstract:
This paper proposes a data mining approach for predicting the duration (travel time) of town trips in New York City. First, two novel approaches, one mathematical and one statistical, are proposed for grouping categorical variables that have a very large number of levels. Both approaches work from a cost matrix generated by repeated post-hoc tests on different pairs. Then, a random forest model is constructed to predict the trip type, short or long. Finally, for each trip type and for each of the mathematical and statistical approaches, separate artificial neural networks (ANNs) are developed to predict trip duration. According to the results, the mathematical approach is more accurate than the statistical approach. The proposed methods are also compared with other methods from the literature and perform better than all of them. For short trips, the RMSE of the mathematical and statistical approaches is 4.23 and 4.27 minutes, respectively; for long trips, the corresponding value is 9.5 minutes. In addition, a modified version of the nearest neighborhood approach, called modified nearest neighborhood (MNN), is proposed for predicting trip duration. This model yields accurate predictions, with an RMSE of 4.45 minutes.
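The two-stage structure described in the abstract (classify the trip type first, then apply a type-specific duration model and report RMSE per type) can be illustrated with a minimal sketch. This is not the paper's implementation: `toy_classifier`, `toy_regressors`, the `distance_km` feature, the 5 km threshold, and the minutes-per-kilometre factors are all hypothetical stand-ins for the trained random forest and neural networks, chosen only to keep the example self-contained.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

Features = Dict[str, float]
Classifier = Callable[[Features], str]
Regressor = Callable[[Features], float]


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean squared error between two equal-length, non-empty sequences."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def predict_duration(
    trip: Features,
    classify: Classifier,
    regressors: Dict[str, Regressor],
) -> Tuple[str, float]:
    """Stage 1 picks the trip type; stage 2 applies that type's regressor."""
    trip_type = classify(trip)
    return trip_type, regressors[trip_type](trip)


def rmse_by_type(
    trips: Sequence[Features],
    actual_minutes: Sequence[float],
    classify: Classifier,
    regressors: Dict[str, Regressor],
) -> Dict[str, float]:
    """Collect predictions per predicted trip type and compute RMSE for each."""
    buckets: Dict[str, Tuple[List[float], List[float]]] = {}
    for trip, actual in zip(trips, actual_minutes):
        trip_type, predicted = predict_duration(trip, classify, regressors)
        ys, ps = buckets.setdefault(trip_type, ([], []))
        ys.append(actual)
        ps.append(predicted)
    return {t: rmse(ys, ps) for t, (ys, ps) in buckets.items()}


# Hypothetical stand-ins (assumptions, NOT the paper's models):
# a distance threshold replaces the random forest, and fixed
# minutes-per-kilometre formulas replace the two neural networks.
def toy_classifier(trip: Features) -> str:
    return "short" if trip["distance_km"] < 5.0 else "long"


toy_regressors: Dict[str, Regressor] = {
    "short": lambda trip: 4.0 * trip["distance_km"],  # predicted minutes
    "long": lambda trip: 3.0 * trip["distance_km"],   # predicted minutes
}

if __name__ == "__main__":
    trips = [{"distance_km": 2.0}, {"distance_km": 3.0}, {"distance_km": 10.0}]
    actual = [10.0, 12.0, 33.0]
    print(rmse_by_type(trips, actual, toy_classifier, toy_regressors))
```

Routing each trip through a type-specific regressor is what allows the error to be reported separately for short and long trips, as the abstract does. In practice the stand-ins would be replaced by fitted models.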
Similar resources
Mortality forecasting based on Lee-Carter model
Over the past decades, a number of approaches have been applied to forecasting mortality. In 1992, a new method for long-run forecasting of the level and age pattern of mortality was published by Lee and Carter. This method was welcomed by many authors, so it was extended through a wider class of generalized, parametric and nonlinear models. This model represents one of the most influential recent d...
Forecasting of Crops Using Data Mining Techniques
The past two decades have seen a dramatic increase in the amount of information or data being stored in electronic format. This accumulation of data has taken place at an explosive rate. It has been estimated that the amount of information in the world doubles every 20 months and the size and number of databases are increasing even faster. Data storage became easier as the availability of large ...
Forecasting Stock Trend by Data Mining Algorithm
Stock trend forecasting is one of the main factors in choosing the best investment; hence, predicting and comparing the stock trends of different firms is one method for improving the investment process. Stockholders need information for forecasting a firm's stock trend in order to make decisions about trading the firm's stock. In this study, stock trend forecasting is performed by a data mining algorithm. It sho...
Forecasting Gold Price using Data Mining Techniques by Considering New Factors
Gold price forecast is of great importance. Many models were presented by researchers to forecast gold price. It seems that although different models could forecast gold price under different conditions, the new factors affecting gold price forecast have a significant importance and effect on the increase of forecast accuracy. In this paper, different factors were studied in comparison to the p...
Forecasting Of Tehran Stock Exchange Index by Using Data Mining Approach Based on Artificial Intelligence Algorithms
Uncertainty in the capital market means the difference between the expected values and the amounts that actually occur. Designing different analytical and forecasting methods in the capital market is also less likely due to the high amount of this and the need to know future prices with greater certainty or uncertainty. In order to capitalize on the capital market, investors have always sough...
The clustering and classification data mining techniques in insurance fraud detection: the case of Iranian car insurance
Given the ever-increasing spread of fraud in the insurance sector, especially in car insurance, and its negative consequences for insurance companies, applying appropriate and efficient methods to identify and detect fraud in this area is essential. Understanding the pattern in data on previously reported claims can help determine whether a damage claim is genuine or not. One of the most common and widely used ways of discovering patterns in data is the use of ...
Volume 16, Issue 1
Pages 1-13
Publication date: 2020-03-01